STCP: Simplified-Traditional Chinese Conversion and Proofreading
Authors
Abstract
This paper aims to provide an effective tool for conversion between Simplified Chinese and Traditional Chinese. We present STCP, a customizable system comprising a statistical conversion model and a proofreading web interface. Experiments show that our system achieves character-level conversion performance comparable to that of state-of-the-art systems. In addition, our proofreading interface can effectively support diagnostics and data annotation. STCP is available at http://lagos.lti.cs.cmu.edu:8002/
Similar resources
基於對照表以及語言模型之簡繁字體轉換 (Chinese Characters Conversion System based on Lookup Table and Language Model) [In Chinese]
The character sets used in China and Taiwan are both Chinese, but they are divided into simplified and traditional Chinese characters. A large amount of information is exchanged between China and Taiwan through books and the Internet. To provide readers with a convenient reading environment, character conversion between simplified and traditional Chinese is necessary. The conversion between simp...
Key Problems in Conversion from Simplified to Traditional Chinese Characters
In this paper we tackle the problem of character conversion from simplified Chinese to traditional Chinese. Of those simplified characters that need conversion, about 9.5% of them have more than 2 counterparts in the traditional scripts. We improve upon the previous log-linear approach first used in (Chen et al 2011) by utilizing more data sets and better translation models. We also show that a...
Aspergillus nidulans stcP encodes an O-methyltransferase that is required for sterigmatocystin biosynthesis.
The Aspergillus nidulans stcP gene was previously identified as a transcribed region associated with a cluster of genes proposed to be involved in sterigmatocystin biosynthesis (D. W. Brown, J.-H. Yu, H. S. Kelkar, M. Fernandes, T. C. Nesbitt, N. P. Keller, T. H. Adams, and T. J. Leonard, Proc. Natl. Acad. Sci. USA 93:1418-1422, 1996). stcP was predicted to encode a methyltransferase responsibl...
The perception of simplified and traditional Chinese characters in the eye of simplified and traditional Chinese readers
Expertise in Chinese character recognition is marked by analytic/reduced holistic processing (Hsiao & Cottrell, 2009), which depends mainly on readers’ writing rather than reading experience (Tso, Au, & Hsiao, 2011). Here we examined whether simplified and traditional Chinese readers process characters differently in terms of holistic processing. When processing characters that are distinctive ...
Parsing Simplified Chinese and Traditional Chinese with Sentence Structure Grammar
We present a challenge to parse simplified Chinese and traditional Chinese with the same rule-based Chinese grammatical resource, Chinese Sentence Structure Grammar (CSSG), which was developed based on a new grammar formalism, Sentence Structure Grammar (SSG). We participate in the simplified Chinese parsing task and the traditional Chinese parsing task of CLP 2012 with the same rule-based chart...
Journal title:
Volume, issue:
Pages: -
Publication date: 2017